Self Checking Protocols: A Step towards Providing Fault Tolerance in Services

Author

  • Gunjan Khanna
Abstract

In this paper we present ongoing work on building a detection and diagnosis system for service-based architectures. We propose a hierarchical detection and diagnosis framework, instantiated in a system called the Monitor. The Monitor verifies the messages exchanged between services against an anomaly-based rule set. Its architecture is application-neutral, which makes it generically applicable to a wide variety of web applications composed of services, and it treats service entities as black boxes, which makes it non-intrusive. The hierarchical framework scales with the number of service entities. We extend the Monitor framework to diagnose failures through causal dependency modeling. This work is a step towards building an autonomic management system for a large class of web applications deployed today.
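
To make the idea of checking observed messages against a rule set concrete, here is a minimal sketch. It is not the Monitor's actual design: the `Message` fields, the `Rule` and `Monitor` classes, and the `max_rate` rule are all hypothetical names chosen for illustration, and the abstract does not say what the real rules look like. The sketch only assumes what the abstract states, namely that the checker sees messages from outside the services and compares them with rules.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Message:
    """An observed message between two services (hypothetical fields)."""
    sender: str
    receiver: str
    kind: str
    timestamp: float


@dataclass
class Rule:
    """A named predicate over the messages seen so far; True means no violation."""
    name: str
    check: Callable[[List[Message]], bool]


@dataclass
class Monitor:
    """Illustrative checker: sees only messages, never service internals."""
    rules: List[Rule]
    history: List[Message] = field(default_factory=list)

    def observe(self, msg: Message) -> List[str]:
        """Record an observed message and return the names of violated rules."""
        self.history.append(msg)
        return [r.name for r in self.rules if not r.check(self.history)]


def max_rate(kind: str, limit: int, window: float) -> Rule:
    """Assumed example rule: at most `limit` messages of `kind` per `window` seconds."""
    def check(history: List[Message]) -> bool:
        latest = history[-1].timestamp
        recent = [
            m for m in history
            if m.kind == kind and latest - m.timestamp <= window
        ]
        return len(recent) <= limit
    return Rule(name=f"max_rate({kind})", check=check)


monitor = Monitor(rules=[max_rate("request", limit=2, window=1.0)])
alerts = [
    monitor.observe(Message("client", "orders", "request", t))
    for t in (0.0, 0.2, 0.4, 5.0)
]
# Only the third message, the third request within one second, raises an alert.
```

Because the rules are plain predicates over the message history, new rules can be added without touching the services themselves, which is the non-intrusive property the abstract emphasizes.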

Similar resources

Developing Fault-Tolerant Control Systems Composed of Self-Checking Components in the Action Systems Formalism

It is widely recognized that a high degree of dependability of computer-based systems can be achieved if dependability consideration starts from the early stages of system development [10]. In this paper we propose an approach for incorporating means for fault-tolerance in the component-based system development. We discuss architecture of a fault-tolerant system based on a composition of so cal...

Combined On-Line/Off-Line Test Solutions for Digital Filters

A low-cost on-line test scheme for digital filters, capable of providing an off-line BIST solution, is proposed. The scheme utilizes an invariant of the digital filter in order to detect possible circuit malfunctioning on-line and shares most of this on-line checking hardware with off-line BIST. The analysis performed indicates that 100% fault secureness & 100% fault coverage are possible, if c...

Implementing Fail-Silent Nodes for Distributed Systems

A fail-silent node is a self-checking node that either functions correctly or stops functioning after an internal failure is detected. Such a node can be constructed from a number of conventional processors. In a software-implemented fail-silent node, the non-faulty processors of the node need to execute message order and comparison protocols to 'keep in step' and check each other respectively....

Deliberative Reasoning in Software Health Management

Rising software complexity in aerospace systems makes them very difficult to analyze and prepare for all possible fault scenarios at design-time. Therefore, classical run-time fault-tolerance techniques, such as self-checking pairs and triple modular redundancy are used. However, several recent incidents have made it clear that existing software fault tolerance techniques alone are not sufficie...

Publication date: 2007